Cluster approximations for probabilistic systems: 
a new perspective of epidemiological modelling 
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Especially in lattice structured populations, homogeneous mixing represents an inadequate as- 
sumption. Various improvements upon the ordinary pair approximation based on a number of 
assumptions concerning the higher-order correlations have been proposed. To find approaches that 
allow for a derivation of their dynamics remains a great challenge. By representing the population 
with its connectivity patterns as a homogeneous network, we propose a systematic methodology for 
the description of the epidemic dynamics that takes into account spatial correlations up to a desired 
range. The equations which the dynamical correlations are subject to, are derived in a straightfor- 
ward way, and they are solved very efficiently due to their binary character. The method embeds 
very naturally spatial patterns such as the presence of loops characterizing the square lattice or the 
treelike structure ubiquitous in random networks, providing an improved description of the steady 
state as well as the invasion dynamics. 
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I. INTRODUCTION 

The spreading dynamics of an infectious disease is cru- 
cially determined by the population's underlying connec- 
tivity patterns. Followed by the renewed interest in graph 
theory witnessed by statistical physics in the recent years 
0, |3| , substantial progress has been achieved in the field 
of epidemiology. As an example, the scale-free degree 
distribution of the Internet accounts for the absence of a 
finite epidemic threshold in the spreading of a computer 
virus Ij, Q . The long persistence of HIV is due to the 
very same topological property of the web of human sex- 
ual contacts |5j , and only targeted immunization leads to 
a finite epidemic threshold [g • 

Most of these studies are based on the mean-field ap- 
proximation which represents a reasonable assumption 
for networks that exhibit large connectivity fluctuations 
(that is (fc^) oo where k represents the number of 
emanating links from a specific node, and the average 
is taken over the entire network). Quite often however, 
e.g. in lattice structured populations, spatial correlations 
become important and heterogeneous mixing has to be 
taken into account. Matsuda et al. first used the ordi- 
nary pair approximation for the treatment of a biological 
problem, and the resulting improvements are consider- 
able. In addition to the analytical tractability of this ap- 
proximation, simple estimates for the epidemic threshold 
can be obtained in terms of a "dyad heuristic" : a condi- 
tion for the location of the critical point is elaborated by 
looking at two neighbouring infected sites, comparing its 
recovery with the infections it gives rise to Q . 

Various extensions upon the standard pair approxima- 
tion have been proposed. In order to analyze the propa- 
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gation of a wavelike invasion in a lattice structured popu- 
lation, a remarkable improvement is brought about if the 
region occupied by the infected individuals is described 
by the ordinary pair approximation, and the leading edge 
of the wave front is modelled by the quasi-steady-state 
pair approximation Bauch and Rand developed a 
pair model for the situation where the population's un- 
derlying connectivity patterns are of a dynamical nature 

The grid-like structure or the presence of triangles are 
topological properties which both the mean-field and the 
standard pair approximations do not account for. An 
approach pursued by different authors is to use param- 
eters that characterize the topology (such as the den- 
sity of triangles) and to make a number of assumptions 
about the corresponding higher-order correlations, which 
leads to improved pair models. Van Baalen illustrates 
this method for the triangular and square lattices, if the 
higher-order correlations are set to 1 11]. The invasion 
dynamics is reproduced very accurately. The same strat- 
egy can be explored for less homogeneous networks |12| |. 
and the consequences regarding epidemiological invasions 
have been discussed in detail |l3|. The improved pair 
approximation JA, takes into account the clustering 
property of lattice models more precisely. Its key ingre- 
dient is to make less restrictive assumptions about the 
higher-order correlations, for example they can be set to 
a value not equal to 1. 

Morris derived the dynamics of higher-order correla- 
tions in the usual vein, that is an equation which deter- 
mines the time evolution of an average quantitiy is used 
as point of departure [T^ . 

In this paper, we introduce a novel method for the 
description of the epidemic dynamics which takes into 
account spatial correlations up to a desired range. The 
methodology is illustrated for the case where the pop- 
ulation's underlying connectivity patterns is given by a 
homogeneous network. Concerning the local contact pro- 
cess, we use a susceptible-infected-susceptible model in- 
volving transition rates between the two possible states 
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(e.g. [13 )■ The formalism is elaborated in discrete time, 
and the continuous-time dynamics arises as a limiting 
case. This limit has been performed in order to allow for 
a comparison with the above sketched approaches. 

The paper is organized as follows. In Section II, the 
adopted model involving the local dynamics as well as the 
selected geometries is described in detail. Section III re- 
views the mean-field and pair approximations, introduces 
our formalism and explains how these approximations are 
recovered. In Section IV, the method is illustrated for a 
random homogeneous network, the triangular and square 
lattices. Section V offers a discussion of the results as well 
as some suggestions for further investigations. 



II. THE MODEL 

Our approach conceives the population as a network, 
with connections between individuals that do not change 
in the course of time. Each node of the network rep- 
resents an individual, and every link symbolizes a rela- 
tionship between individuals that involves repeated con- 
tacts, and therefore the transmission of an infective agent 
proceeds along connections. Real social networks ex- 
hibit rich degree distributions, that is vertices can differ 
considerably in the number of nearest neighbours. As 
the aim of this paper is the introduction of a methodol- 
ogy that systematically takes into account higher-order 
correlations, we adopt the simple susceptible-infected- 
susceptible model and focus on networks where every 
node has the same number of nearest neighbours. De- 
spite the homogeneity of these graphs, there exist several 
classes of such networks differing in topological proper- 
ties beyond the degree distribution. We shall oppose the 
regular square lattice to the case where the underlying 
social structure is fully random, furthermore our approx- 
imation scheme is illustrated for a triangular lattice. The 
generalization to the SIR- or SEIR-models, where the in- 
dividuals can be in 3 or even 4 possible states, is straight- 
forward. 

A homogeneous random network of degree K and size 
N is constructed as follows (Fig. To each of the N 
vertices, K ends of edges are attached. The free ends are 
then connected at random. 

The time evolution of the states of the vertices are 
given by the following rules. Infected nodes recover spon- 
taneously at a rate 5. On the other hand, an infected in- 
dividual can infect any of its K nearest neighbours at a 
rate v. Because what matters is the ratio of the transmis- 
sion and recovery rates, we can reduce the number of pa- 
rameters by rescaling the time unit. Thus without loss of 
generality, the local dynamics is determined by the immu- 
nization rate 1 and the effective spreading rate A — v jb. 
In section III, we will elucidate how this continuous-time 
model is recovered as a limiting case from a more general 
discrete-time description. 




FIG. 1: Construction of a homogeneous random network of 
degree A' = 4 and size A'^ = 16. Tlie nodes are connected 
randomly and its number of emanating edges are constrained 
to be if = 4. 



III. REVISITING THE MEAN-FIELD AND 
PAIR APPROXIMATIONS 

In this section, we first review the mean-field and stan- 
dard pair approximations. These descriptions are ob- 
tained by a rate equation which determines the time 
evolution of some average quantity such as the density 
of infected individuals or the density of pairs of infected 
individuals. Up to the level of pair correlations, this is 
indeed a reasonable approach. But if one wants to keep 
track of higher-order correlations (for example the den- 
sity of plaquettes of four infected nodes in the case of the 
square lattice), a more general starting point may reveal 
to be advantageous. In subsection B, we derive an exact 
description of the epidemic dynamics and show how the 
mean-field and standard pair approximations are recov- 
ered in a rather automatic way in part C of this section. 
The various higher-order approximations are elaborated 
in the following section. 

A. Conventional Approach 

The rate of change of an average quantity / (such as 
the fraction of sites in a particular state) is described as 

/-EE Ke.)(/e. -/), (1) 

where X is the set of all sites, and represents the 
set of all events that can occur at x. A particular event 
changes the average from / to /e^ and occurs at rate 
r(e,) [13. 



The SIS-model allows for two possible states, namely 
susceptible (0) and infected (1). At the mean-field level, 
the dynamics is described in terms of the density of 
infected individuals pi, and the fraction of susceptible 
nodes obeys po = 1 — pi. Eq. |^ translates into 



Pi = -pi + XKpopi. 



(2) 



The first term accounts for infected nodes becoming 
healthy whereas the second term describes the new in- 
fections, fully ignoring pair correlations. 

In the framework of the standard pair approximation 
0, the dynamics is described in terms of the doublet 
densities pxy {x,y G {0, 1}), this quantity corresponds to 
the probability that a randomly chosen pair is in con- 
figuration {xy). They are related to the global densities 
Px and local densities (conditional probabilites) Px\y by: 
pxy = Pyx = PxPy\x = PyPx\y The global and local den- 
sities satisfy 

1 1 
^ = 1 and ^ px\y = 1 for any y e {0, 1} 

x=0 x=Q 

Eq. tells that the density of infected individuals and 
the doublet density pn evolve in time according to 



pi = ~pi + \KpQ\ipi 



Pii 



-2pii -I- 2Apio + 2\{K - l)pi|oi/5io. 



(3) 



V. 



FIG. 3: Epidemic spreading on homogeneous networks of de- 
gree 4. The average number of infected individuals p (preva- 
lence) as a function of the inverse infection rate 1/A in the 
steady state is shown. The simulations on the square lat- 
tice (squares) and random homogeneous network (circles) 
exhibit higher epidemic thresholds with respect to the ap- 
proximations. The mean-field description (dotted line) yields 
Ac = 1/4 whereas the pair approximation (dashed line) leads 
to Ac = 1/3 for the epidemic threshold. The latter is also in 
better agreement with the simulation results for 1/A — > 0. 
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FIG. 2: An arbitrarily chosen link and its nearest neighbour- 
hood within a homogeneous network characterized by the de- 
gree distribution P{k) = Sk4- The dashed lines indicate the 
connections which are present in the case of a square lattice. 

The first of Eqs. (0) can also be regarded as the re- 
sult of substituting po by po|i in Eq. |2Jl, i.e. the sus- 
ceptible node that is to be infected has to be a nearest 
neighbour of the vertex which will transmit the infective 
agent. The second of Eqs. (O includes a recovery term 
[the first term on the right hand side, destruction of (11)- 
pairs] and transmission terms [the second and the third 
terms, creation of (ll)-pairs]. The first term describes 
transitions of pairs in state (11) to either (10) or (01). 
Both transitions occur at rate 1 (the recovery rate) and 
thus give rise to the factor 2. The factor 2 in the second 
and the third terms is needed because we do not assume 



any asymmetry between sites, which means pio = poi- 
A (ll)-pair can be created from a (lO)-pair either if the 
infective agent proceeds along the connection within that 
pair (second term) or if the susceptible node is infected 
by one of the other K — 1 nearest neighbours of it (third 
term, see also Fig.|21l. This path involves the conditional 
probability Piioi [i-e. the probability of finding an infected 
node adjacent to a (Ol)-pair] which is approximated by 
Pi|o as in the ordinary pair approximation, only nearest- 
neighbour correlations are taken into account. In order 
to solve the Eqs. (j2Jl, the system has to be closed. The 
set pi, Pi|i is a suitable choice, but pn, pio works equally 
well. 

Fig. El contrasts the solutions of Eqs. l|2Jl and © with 
the simulations for two different homogeneous networks 
of degree if = 4, i.e. the square lattice and the one in- 
troduced in Fig. ^ The pair approximation provides a 
rather good description of the equilibrium dynamics on 
top of a random homogeneous network, whereas the de- 
viation from the simulation result is remarkable if the 
population is arranged on a square lattice whose topol- 
ogy is characterized by the presence of many loops of 
short length. 

We shall now develop a more general formalism that 
will serve as a starting point in order to investigate the 
role of correlations beyond the pair level. 
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B. Exact Description 

In order to arrive at a more general point of departure 
which will allow us to investigate the role of higher-order 
spatial correlations, we shall describe the system by as- 
signing a probability 7^t(x) to every possible configura- 
tion X at a given time t where each of the x'^s can be 
either (susceptible) or 1 (infected). This probability is 
subject to 

X 

at every instant of time. The above introduced SIS-model 
yields the following transition probabilities for the possi- 
ble events that can occur at an arbitrary site I, involving 
a discrete time step At 

Wl^, = At Wl,^, = Hil - XAtyj) 

jnnl 

Wl^,^l-At W^^,^l-l[{l-\Atyj), 

jnnl 

where the products have to taken over the nearest neigh- 
bours of site I. By using the binary variables xi and yi, 
the above expressions are summarized as 



W'^ XI + {1-2x1 



Atyi + {l-yi)Y[il-\Aty,) 

jnnl 

(4) 

If the total number of nodes is denoted by N, the tran- 
sition probability that the system changes from configu- 
ration y to X can be written as 



N 



1=1 



(5) 



and on an exact level, the epidemic dynamics is governed 
by 



^t+At(x) = EWy. 



xPt(y) 



(6) 



with Wy^x given by Eq. ((SJ. Eq. ^ will serve as start- 
ing point for various approximations, be it in discrete or 
continuous time. In the latter case, only the terms up 
to order 1 in At have to be taken into account, but this 
limit shall be carried out later on. As most of the existing 
methods are formulated in continuous time, we will elab- 
orate the approximations for this case in order to allow 
for a comparison. 



C. Derivation of the Mean-Field and Pair 
Approximations 

Within this subsection, it is shown how the approxima- 
tions (|2Jl and ||3Jl are recovered from the exact description 



At the mean-field level, the dynamics is expressed in 
terms of the density of infected individuals. This quantity 
corresponds to the probability that an arbitrarily chosen 
site i is in state Xi — 1. In order to derive its time 
evolution, we sum Eq. ^ over all possible configurations, 
Xi held fixed 

V 

The left hand side of the above equation is Pt+Atixi), i.e. 
the probability that site i is in state Xi at time t + At. 
The mean-field approximation consists in considering the 
sites as independent from each other, i.e. 



N 



Vt{y) = llPt{yi), 



(8) 



1=1 



which corresponds to the homogeneous mixing hypothe- 
sis. Performing the summations, we find for Xi = I 

Pt+At{l) = 1 - AtPt(l) - Pt(0)[l - XAtPtil)f (9) 

whose continuous-time limit (At 0) is 

P(l) = -P{1) + XKP{Q)P{1), 

which is easily identified with Eq. Q since P(l) — pi 
and P(0) = po- 

Let us now see how the pair approximation is obtained 
by using our formalism. For this purpose, we sum Eq. 
© over all possible configurations, xa and xb held fixed, 
where A and B are the two sites of an arbitrarily chosen 
pair 

(10) 

The left hand side of the above equation corresponds to 
the probability that the pair AB is in state {xax.b) at 
time Ai, which shall be denoted by Pt+At{xAXB)- By 
adopting the enumeration introduced in Fig. [21 we ob- 
tain from Eq. Q for the transition probability (j/a2/b) ^ 
{xa^b) 

=TATR 

yA-*XA VB^XB 'Ji'a 

+At(l - 2xA)[yA - A(l - yA)(yB + yi + y2 + yz)]TB 
+At{l - 2xB)[yB - A(l - yB){yA + yA + y^ + y&)]-TA 

(11) 

where the linearization has been carried out at this point 
due to technical convenience and 

n = Ti{xi,yi) =Xi + {l- 2xi){l - yi), (12) 
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an abbreviation which will also be used below. Further- 
more the expression (jllll only involves state variables yi 
where i is either A, B or one of its nearest neighbours. 
The sum over the remaining yj is therefore carried out 
trivially. Taking into account correlations up to range 2 
only, we write for the probability that the pair AB and 
its nearest neighbours are in given states 

yi Vi 

Pt I y2 yA yB ys \ = PtiyAVB) 

2/3 ye 

X Ptiyi\yA)Ptiy2\yA)Pt{y3\yA) 

X Ptiy4\yB)Pt{y5\yB)Ptiy6\yB)- (13) 

The conditional probabilities in the above ansatz are ex- 
pressed as 



Piy^\yA) 



P{yiVA) 

PiVA) ' 



where P{yA) = X]x=o ^(-^y^)- Using this ansatz and 
performing the remaining summations, the continuous- 
time limit of Eq. (fTn|l leads to the system (for general K) 



P(00) = 2P(10) 



1-A(if-1 



P(00) 



P(0) 

P(10) ^ P(ll) - P(10) + AP(IO) 



2(X-1)«-X 



P(0) 



P(ll) = -2P(11) - 2AP(10) 



^ ' P(0) 



(14) 



By identifying the pair probabilites P(xy) with the dou- 
blet densities pxy and since pml Pa = 1 ^ Pw/ Po, the 
system of Eqs. (I14II corresponds to Eqs. (PJ. 

In summary, in the standard derivation of the mean- 
field and pair approximations based on Eq. (QJ, the rate 
of change of an average density is directly expressed by 
all the different events that can alter its value in a rather 
heuristic way [Eqs. ^ and ©]. On the other hand, the 
derivation of the approximations becomes an automatic 
procedure involving 

• an initial summation of the system probability 
7't+At(x) over all possible states except a few in 
order to obtain Pt+At{x) or Pt+\t{xAXB) [Eqs. Q) 
and l|TU|)]. 

• an ansatz corresponding to the approximation [Eqs. 
I© and 1131)], 

• and the continuous-time limit. 

However, the last step is not really imperative. Our 
methodology works equally well in discrete time. If At is 
set to 1, A At = A then corresponds to a probability rather 
than to a rate and higher-order terms in A appear in the 
equations. As an example, the discrete-time evolution at 



the mean-field level is governed by Eq. Obviously, 
the results quantatively differ from the continuous-time 
limit. The full advantage of this formalization will be- 
come clear in the next section. 

It is also important to note that topological properties 
beyond the degree distribution do not enter at the level 
of the standard pair approximation. In the case of the 
square lattice, the nodes 1 and 4 as well as 3 and 6 (Fig.|21) 
are also connected whereas these links are missing in its 
random counterpart. Various improvements upon the or- 
dinary pair approximation have been proposed. Instead 
of deriving the higher-order correlations from the dynam- 
ics of the system, these pair models consist in making a 
number of biologically motivated assumptions involving 
parameters that characterize the topology of the underly- 
ing network. We shall compare our approach with these 
improved pair models in the next section. 



IV. FURTHER SYSTEMATIC IMPROVEMENT 

The difference between the simulation result and the 
pair approximation in Fig. |31 is rooted in the neglection 
of correlations of range greater than 2. Especially in the 
vicinity of the phase transition, where a finite fraction 
of the nodes starts being infected, the links should not 
be considered independently, and higher-order dynami- 
cal correlations have to be taken into account. In other 
words, the state Xi of node i at time t -|- At is determined 
by all the states of its nearest neighbours of node, i.e. 
it is not the case that the states of the various nearest 
neighbours at time t contribute independently from each 
other to the state Xi at time t -I- At. 

We therefore want to incorporate the longer correla- 
tion range by extending the fundamental cluster (site, 
pair) to a star or square, respecting the underlying net- 
work's topology. Therefore different spatial patterns are 
embedded very naturally by our method. The dynami- 
cal equations, to which the higher-order correlations are 
subject to, are derived in a very straightforward way by 
our formalism. The binary nature of these equations al- 
lows for a very efficient solution by the computer. On 
the other hand, the equations can be simplified further 
by taking into account the underlying symmetries. This 
latter procedure will be illustrated for the triangular and 
square lattices. Performing this extension, we find an 
improved description of the steady state as well as the 
dynamics. 

Alternatively, it is possible to derive the dynamics of 
triple correlations by using Eq. Although this ap- 
proach has the advantage that no specific cluster must 
be chosen, it is a rather difficult undertaking . 



A. Homogeneous Random Network 

Random networks are characterized by a vanishing 
clustering (local interconnectedness), but the average dis- 
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FIG. 4: An arbitrary node (denoted by 0) with its corre- 
sponding starlike fundamental cluster within a homogeneous 
random network of degree K = 4. 



tance between any pair of nodes only increases logarith- 
mically with the system size: this is known as the small 
world phenomenon p^ . It is easy to imagine that the 
epidemic spreads the more rapidly the smaller the un- 
derlying "world" is. 

As the local topology is fully treelike, we shall use a 
star as our fundamental element. In contrast to regular 
lattices, this extension is a unique choice. Fig. 21 shows 
an arbitrarily chosen node in a homogeneous random net- 
work and two hierarchies of its nearest neighbours, also 
introducing the notation which is adopted below. 

The probability that, at time node is in state 
xq and its nearest neighbours 1,2,3,4 are in the states 
{xi,X2,X3,X4} is denoted by 




and obtained by summing 7't(x) over all possible config- 
urations, {xo,xi,X2,X3,X4} being fixed. 

The probability that the nearest and second-nearest 
neighbours of node are in given states, is given by the 
ansatz (the sum over the remaining y-states is again per- 
formed trivially) 

Pt{{yj}jeAf) Pt \yi yo ys {[Ptiynyisynlyiyo), 
\ yi / ;=i 

(15) 

where Af represents the set of nodes depicted in Fig. 0] 
and the conditional probabilities are 

yi4 

Pt I yi3 yi yo 

Pt{yi2yimi\ym) = 57 — r 

Ptyyiyo) 

The pair probability appearing in the above expression 



is extracted from the corresponding star probabilities by 

111 ( yiA 
Pt{ym)^ X! X! ^Pt\yi^ vi yo 

yi2=o t/i3=o t/i4=o \ yi2 





FIG. 5: Equihbrium prevalence of the epidemic process on 
a homogeneous random network with P{k) = 8^4. The star 
approximation (solid line) is in excellent agreement with the 
simulation result (circles), yielding also an accurate descrip- 
tion of the critical region. The mean-field (dotted line) and 
pair approximation (dashed line) have been plotted again for 
comparison. 

With these ingredients, the continuous-time limit of 
Eq. injl reads 



n - 2x,)[t/, - A(l - y,) J2 yj] U ^k \] 

i—0 I jnni k^i J 

(16) 

with P{{yj}j^_\f) given by Eq. ((TK|) . The binary charac- 
ter of this system of 2^ = 32 equations permits a very 
efficient numerical implementation. On the other hand, 
if one takes into account the symmetries of the problem, 
the degrees of freedom can be reduced to 10, but this 
procedure will be shown for the regular lattices. Fig. [51 
shows the striking agreement of the star approximation 
with the simulation result, all along from a high effective 
spreading rate to its threshold value, for the equilibrium 
situation. Fig. |^ opposes the various approximations to 
the stochastic simulation for the case of the invasion of 
an infective agent, the initial prevalence being set to 0.01. 
Whereas the steady state is reached rather quickly in the 
mean-field description, the slope of the star approxima- 
tion is in remarkable agreement with the simulation. As 
correlations of a greater range are taken into account, it 
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probability that the corners of the square ABCD are in 
the states {xa, xb,xc,xo\ at time t is deduced from the 
system probabihty by 



Xa Xb 
Xd xc 



{xi}i,f{A,B,C,0} 



If the nearest neighbours of the vertices A, B, C and D 
are enumerated according to Fig.[7| we write for the prob- 
abihty that the nodes comprised within these 5 squares 
(i.e. the nearest neighbours of the central plquette) are 
in given states 



FIG. 6: Invasion of an infective agent (infection rate A — 3/7) 
in a population whose connectivity patterns are given by a ho- 
mogeneous random network. At the mean-field level (dotted 
line), the initial prevalence of 0.01 increases to its equilib- 
rium value during 10 time units only. The pair approximation 
(dashed line) provides a further improvement, and the star ap- 
proximation (solid line) is in remarkable agreement with the 
stochastic simulation. 



X Pt{yiy2\yAyB)Pt{ym\yByD) 

X Pt{y5ye\ycyD)Pt{y7ys\yAyc) (17) 

with 



can also be observed that the system equilibrates more 
smoothly, that is p{t ~ 30) for the star approximation is 
considerably smaller than the rate of change of p at time 
t ~ 10 at the mean-field level. 



B. Square Lattice 

1 2 
• 



D 



FIG. 7: An arbitrarily chosen square within a 2-dimensional 
lattice and the denotation of the nearest neighbours of its 
corners. The former serves as the fundamental element within 
the square approximation. 

In contrast to random graphs, the epidemic dynamics 
on top of this regular network is essentially dominated 
by the presence of loops. In order to arrive at a level 
of description beyond the pair approximation, we shall 
use the square as our fundamental cluster. This seems 
to be a natural choice, although it is not unique as dis- 
cussed below. In analogy to the previous subsection, the 



Pt 



PtiymlyAys) 



yi y2 
yA ys 



PtiyAys) 



involving the pair probability 



PtiyAyB) = E E ^* 



Xl—O X2—0 



Xi X2 

yA ys 



and analogously for the other factors appearing in H17|l . 
At this point, we could again write down an equation of 
the type ifTCI) . but the 2^ = 16 plaquette probabilities are 
subject to various symmetries, which we will denote by 



Pt 



Pt 



Pt 



<lo,t 



= Pt 



^Pt 



Pt 



^Pt 
= Pt 



= Pt 



= Pt 



q2.t 



Pt 



Pt 



qi,t 



= 4t 



93,t 



94, t 



The exact description ijHJl leads to the following 
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to be noted that the computational load of the square 
approximation in this form is reduced dramatically with 
respect to its raw form [analogous version of Eq. Ijlfijl ]. 

Fig. |S1 shows the systematic improvement brought 
about by the square- and the bisquare approximations in 
dynamical equilibrium. The latter is a description whose 
fundamental cluster is composed of two squares. Its pre- 
diction of the epidemic threshold (Ac — 0.38) is still lower 
than the simulation result (Ac — 0.41): this highlights 
the crucial role of the higher-order spatial correlations 
in lattice structured populations. Fig. ^ represents the 
improvements upon the dynamics. Note that from a cer- 
tain characteristic time the simulation lags behind all the 
approximations as a direct consequence of the stochas- 
ticity which is particularly important at low prevalences. 
However this characteristic time is shifted to the right 



FIG. 8: Equilibrium prevalence in a square lattice structured 
population. The mean-field (dotted line) and pair approxi- 
mations (dashed line) are levels of description at which topo- 
logical properties beyond the degree distribution do not en- 
ter. The approximations involving the square (solid line) and 
a rectangle composed of two squares (dashed-dashed-dotted 
line) as fundamental units are shown to yield a systematic 
improvement of the steady-state behaviour. 



continuous- time dynamics for these quantities 

qo = 4gi - SXTiQq 

+ q2 + A[-2gi(l + 2Ti + T2) + 2rigo] 
f 2^3 + A[-4g2'^ + 2ri(go + iqi) + ^iqi - q^)] 
f 2^3 + \{-Aq^ + mqi - 4:T2q^) 
- qi + A[2gi + 4g^ - 4^3 - 2ri(go + 2(7i) 



qi 

q^ 

f2 



-qi + 

-2qt 
-2q§ 



q-i = -3^3 -+ 

+ m{qi 

qi = -4^4 4 
where 



2q^ + 2q§)] 



X[8q§ 



16q3 - 8T2{qi 



q^ 



q?) 



(18) 



and T2 



Pi 



involving the following triplet- and pair probabilites given 
by the square probabilities through 



= qi 



qt 



1 



= q^' 



93, 



Pt 



P(00) 
P(10) 



90 

qi 



2qi 
q^ 



-q^ 
q^- 



and 



93- 



Since 



go + 491 



2q^ 



4^3 + 94 = 1, 



the square approximation in the form (|18|) represents a 
dynamical system of 5 degrees of freedom. It also has 




FIG. 9: Epidemic dynamics in a square lattice structured pop- 
ulation (transmission rate A = 1/2). By taking into account 
correlations of a greater range, the slope during the transient 
time decreases as a comparison of the mean-field (dotted line), 
pair (dashed line), square (solid line) and bisquare approxi- 
mations (dashed-dashed-dotted line) shows. The diflerence 
between the simulation result and the bisquare approxima- 
tion remains significant during the invasion period due to the 
considerable effect of random events at overall low prevalence. 

An improvement upon the standard pair approxima- 
tion can also be obtained as follows • Instead of de- 
riving the square probabilities from the dynamics of the 
system, one can write it as 



P{7 7 = P{x,)PiXa)Pixi,)Pix,)aaCabCbjC,,Taabj 
\Xj XbJ 

involving the relative pair and square correlation factors 
Cxy and Tui^iyj. For a straight triple, it is supposed 

P{xiXaXb) = P{Xi)P{Xa)P{xb)CiaCabT^iab- 

By setting the relative correlation factors T^j^bj and 
Tziab to 1 and using the fact that on the square lat- 
tice 1/3 of the triples are straight and 2/3 form part of 
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6 5 



FIG. 10: An arbitrarily chosen triangle and its nearest neigh- 
bourhood. The dashed lines indicate that the corresponding 
hnks are ignored. 



a square, one obtains an improvement for P(xi\xaXb)- 
In other words, the conditional probability P{xi\xaXb) 
is not simply set to P{xi\xa) as it is done in the ordi- 
nary pair approximation, but rather the loop structure 
is incorporated while still using pairs as building blocks. 



C. Triangular Lattice 

In its ordinary formulation, the fact that two sites can 
have neighbours in common, is simply ignored by the 
pair approximation. By means of the triangular lattice, 
we show how the method introduced in this paper has to 
be applied, i.e. what the next level of description beyond 
the pair approximation is. 

The clue is to use the triangle as the basic element. In 
analogy to the previous cases, the probability that the 
vertices of a triangle ABC are in the states {xa, xb,xc} 
at time t is obtained through 

Fig. ^Ishows the neighbourhood of an arbitrarily chosen 
triangle within this lattice. For the probability that the 
vertices depicted in Fig.^|are in given states, we suppose 

-Pt({yJje{A,B,c,i,2,...,9}) = 

x^Pt{yi\yAyB)Pt{y4\yByc)Pt{y7\ycyA) 
>^Pt{y8y9\yA)Pt{y2y3\yB)Pt{y5y6\yc)- 

The conditional probabilities appearing in the above ex- 
pression can be written as fractions involving site- and 
pair probabilities. The latter are deduced from the tri- 
angle probabilities in analogy to previous explanations. 



As the triangle correlations are subject to the symmetries 

a further simplification can be performed, and finally 
the continuous-time triangle dynamics is governed by the 
equations 

to ^ 3[ti ~ 2X{Ai + A2)] 

ii = -ti + 2t2 + 2A(-2ti + 3Ai + 2A2 - 2A3 - A4) 
<2 = -2t2 + h + 2\{ti - 3t2 - 3^1 - A2+ 4A3 + 2^14) 
U = 3[-<3 + 2A(ti + 3i2 + Ai- 2A3 - Ai)]. 

(19) 

where 

, Pito . toil 

Ai = , A2 — 

So Po 

A,^P^ and A,= '-^ 
So Pi 

depending on the pair probabilities pi — ^"(10) = <i +t2, 
Po = P(00) = to + ti and the site probability so = P{0) = 

to + 2ti+t2. 

Because of the constraint 

to + 3ti + 3t2 + h = 1, 

we have three degrees of freedom in the triangle approx- 
imation (fT^ . 

As far as the equilibrium prediction is concerned, the 
triangle approximation provides a very good description 
for 1/A < 3 fFig. Ill|l . The difference between its thresh- 
old prediction (1/Ac — 4.5) and the simulation result 
(1/Ac — 3.9) is of the same order of magnitude as the 
plaquette approximation in the case of the square lattice. 
Concerning the dynamics fFig. ll2|) . we also observe a lag 
between the simulation and the approximations, and the 
slope during the transient time is slightly improved as 
one goes from the pair to the triangle approximation. 

The strategy outlined at the end of the last subsec- 
tion can also be applied to the triangular lattice [Tl| . 
In addition to the open triplet probability, the triangle 
probability is written as 

P( ) ^ P{Xi)P{Xa)P(xt,)C,aCabT/^M- 

\XaXbJ 

One then obtains an analogous correction for P{xi\xaXb) 
involving a parameter 9 denoting the fraction of triplets 
in closed form which is 2/5 in the triangular lattice. 
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FIG. 11: Steady-state prevalence in a triangular lattice struc- 
tured population. The mean-field description (dotted line) 
yields an epidemic threshold Ac = 1/6. With respect to the 
pair approximation (dashed line), the description based on 
the triangle (solid line) provides a better approximation of 
the simulation result (triangles). 



Interestingly, the simplest elaboration of this approach 
(TAmb = Tziab = T^ab) reproducBS the invasive period 
reasonably accurate if 9 is chosen larger than its correct 
value {9 ~ 0.6). Keeling et al. and Rand also developed 
improved pair models based on this approach [13, • 



0.5 - 




t 



FIG. 12: Invasion dynamics in a population represented by a 
triangular lattice (A = 1/3). The upper two curves show the 
mean-field (dotted line) and pair dynamics (dashed line). The 
improvement brought about by the triangle approximation 
(solid line) still lags behind the simulation result due to the 
same reason as in the case of the square lattice. 



V. CONCLUSION 

We have studied a dynamical model of epidemic 
spreading where every individual is in contact with an 
equal number K of nearest neighbours. Infected nodes 
recover spontaneously at a rate 1, on the other hand, they 
infect neighbouring susceptible sites at a rate A. We have 
chosen this simple SIS-type model since the focus of this 
article is the introduction of a novel methodology that al- 
lows a rather straightforward derivation of the dynamics 
of higher-order correlations. 

The method we propose here consists in choosing a fun- 
damental cluster composed of a certain number of nodes 
n as well as links connecting them. A definite probability 
is assigned to each possible configuration of the basic ele- 
ment. The size of the fundamental cluster represents the 
range up to which spatial correlations are exactly taken 
into account. At a level beyond the pair approximation, 
the choice of the basic element is guided by the under- 
lying network's topology. In the case of the square lat- 
tice, clusters composed of at least one plaquette serve as 
the fundamental element; for random networks the local 
treelike structure is incorporated by using the star as the 
basic unit. Spatial patterns beyond the degree distribu- 
tion are therefore embedded in a very natural way by our 
method. Describing the epidemic dynamics of the entire 
population as a discrete time Markovian process, the ap- 
pearing probabilities (probability that a cluster and its 
nearest neighbourhood is in a given configuration) are ex- 
pressed in terms of the fundamental cluster probabilities. 
The continuous-time dynamics emerges as a limiting case 
{At-^0). 

With respect to the ordinary (rather heuristic) deriva- 
tion of the mean-field and pair approximation, these de- 
scriptions are derived with the help of our formalism by 
using the site or the pair respectively as fundamental 
clusters in a very automatic way. Independently of the 
specific choice of the cluster, the binary character of the 
resulting equations allows for a very efficient solution by 
the computer. Likewise, a further simplification can be 
reached if the symmetries which the fundamental clus- 
ter probabilites are subject to, are taken into account. 
As soon as correlations of range greater than 2 are not 
ignored, our method yields improved estimates for the 
location of the phase transition. In the case of the ran- 
dom network, the star approximation already leads to an 
excellent description of the steady state and the transient 
dynamics. In the regular counterpart, many squares have 
to be included within the corresponding fundamental 
unit in order to attain the same level of accuracy. This is 
due to the presence of stronger correlations caused by the 
high local networking. The method was also illustrated 
for a triangular lattice and contrasted to approaches that 
make a certain number of assumptions about the higher- 
order correlations which lead to improved pair models. 

However the novelty of the present work lies in the 
formalism which essentially consists in a more general 
starting point and its associated systematic improvabil- 
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ity rather than the specific results for the selected epi- port, 
demiological model and geometrical examples. 
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